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Abstract. In an experimental study of single enzyme reactions, it has been proposed 
that the rate constants of the enzymatic reactions fluctuate randomly, according to a 
given distribution. To quantify the uncertainty arising from random rate constants, it 
is necessary to investigate how one can simulate such a biochemical system. To do this, 
we will take the Gillespie's stochastic simulation algorithm for simulating the evolution 
of the state of a chemical system, and study a modification of the algorithm that incor- 
porates the random rate constants. In addition to simulating the waiting time of each 
reaction step, the modified algorithm also involves simulating the random fluctuation 
of the rate constant at each reaction time. We consider the modified algorithm in a 
general framework, then specialize it to two contrasting physical models, one in which 
the fluctuations occur on a much faster time scale than the reaction step, and the other 
in which the fluctuations occur much more slowly. The latter case was applied to the 
single enzyme reaction system, using in part the Metropolis-Hastings algorithm to enact 
the given distribution on the random rate constants. The modified algorithm is shown 
to produce simulation outputs that are corroborated by the experimental results. It is 
hoped that this modified algorithm can subsequently be used as a tool for the estimation 
or calibration of parameters in the system using experimental data. 



1. Introduction 

Gillespie's Stochastic Simulation Algorithm has recently gained popularly as a method 
for simulating the evolution of biochemical systems. Its advantage lies in its ability to 
capture the inherent stochasticity present in a chemical reaction system, and provide a 
full statistical description of the evolution of the system — a point not addressed by tra- 
ditional mass action theory, which, through a mathematical model of ordinary differential 
equations (ODEs), is able to capture only ensemble averaged behaviour and assumes a 
continuum of reactant concentrations. Despite the fact that the deterministic mass ac- 
tion theory and the stochastic model of Gillespie's algorithm are equivalent in the limit 
of large system sizes (both assuming the well-mixed assumption), it is widely accepted 
that the stochastic model is more appropriate for biochemical applications, for the rea- 
son that biochemical systems commonly involve very low numbers of reactant molecules. 
However, recent studies in the phenomenon of dynamic disorder of biomolecules reveal 
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further stochasticity in certain biochemical systems that has yet to be accounted for by 
either of the two models: biochemical systems with randomly fluctuating rate constants. 

Dynamic disorder refers to the fluctuation of the conformational state of a biomolecule, 
which may be attributed to the minimization of energy landscapes |15j . These fluctua- 
tions are also associated with causing the fluctuation of the reaction rate constant of the 
biomolecule due to its changed conformational states j23[ EI] • The effects of fluctuations 
due to dynamic disorder has been actively investigated by experimentalists [U H3J HU [Ml 
[T9l I12j . whilst models to describe this phenomenon, both on the molecular level and on 
the macroscopic kinetics level, have also been studied [IHl HU HTJ EJ EJ EUJ EH [25]. It is 
interesting to note that a subdiffusion of the conformational fluctuations based on frac- 
tional Brownian motion model was proposed by |18j . in which a key assertion was the 
molecule's long-range memory of its conformational fluctuations resulting in fluctuations 
on a broad range of time scales. In all the findings, it is widely realized that the effects 
of dynamic disorder on the reaction kinetics of biochemical reaction systems differ from 
the behaviour of systems where dynamic disorder is absent, and are not detectable by 
ensemble experiments or models. One case in point is shown in [T], where experiments 
on the /3-galactosidase enzyme reaction have uncovered interesting statistical properties 
- heavy-tailed behaviour and correlation in the product formation rates — which are 
not predicted by models with non-random rate constants. In particular, careful statistical 
analysis in [1] lead the authors to propose that the rate constants in this enzyme reaction 
system take on a continuum of values and has a stationary gamma distribution. 

In light of the findings in the single enzyme reaction, the purpose of this paper is to 
propose a modification of Gillespie's algorithm to allow for random rate constants to be 
incorporated into the simulation. Whilst we retain the idea of the original algorithm, which 
is to simulate a sample trajectory for the time evolution of the system by simulating the 
successive occurrences of every reaction in the system, we extend the underlying stochastic 
model to account for both the simulation of the waiting time between reactions, as well 
as the simulation of the time evolution of the fluctuating rate constant. In essence, the 
stochastic model for the evolution of the chemical system is a joint distribution (r, c) that 
models the interdependency between the waiting time r and the changing rate constant 
c. The joint distribution is a matter of modelling, thus would of course depend on the 
physical properties of the reactant molecules. However, what we propose here is a general 
framework that forms a starting point for how one might go about the modification of 
the Gillespie's algorithm. Subsequently, we specialize the general framework to the single 
enzyme example investigated in [1] , and illustrate a successful way to incorporate physical 
constraints on the slowly interconverting conformers into model for the joint distribution. 
It is here that the basic idea of Markov chain Monte Carlo is applied as a model to enforce 
the physical constraints, and numerical simulations from the algorithm thus derived are 
in good concert with the experimental results. 

Before proceeding, we note how this paper differs from other works relating to the 
Gillespie's algorithm. One major limitation of the algorithm is its slowness, thus one broad 
area of research focuses on speeding up the algorithm. Examples of these methods include 
the t- leaping method [5] , used to simulate more than one reaction at once, and the quasi- 
SSA and multiscale techniques, used for simulating stiff systems that possess reactions that 
occur on multiple time scales [U E[ E] • Simulations can also fully or partially employ the 
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use of the equivalent chemical master equation [271 [28]. Such speed-ups are not intended 
to change the fundamental stochastic properties of the chemical system, but typically 
to merely make approximations to the stochastic model or its chemical master equation 
to achieve better computational efficiency. Other works have studied more fundamental 
modifications of the system model, such as fluctuating rate constants of white noise type 
\26\ [2]. or biochemical systems possessing rate constants that vary with time as a result of 
external factors such as cell growth or temperature changes [TU1 HH E2] . However, to the 
best of our knowledge, no works have addressed the general issue of randomly fluctuating 
rate constants, particularly the kind exhibited in the presence of dynamic disorder where 
the rate constants vary over a wide range of time scales comparable to or slower than the 
time scale of the reactions. Finally, we remark that there seems to be no easy way to reduce 
the problem of dynamic disorder into the original framework of Gillespie's algorithm; even 
if the conformational changes were to be represented as a finite number of basic reactions 
in the chemical system, such a reduction is inadequate for reproducing the full dynamics of 
a system that actually has a continuous range of conformational states and rate constants 
(see pQ and its accompanying supplementary material). 

2. Deriving the modified SSA 

Before launching into the modifications of the Gillespie algorithm, we briefly review the 
ideas in the original algorithm. The Gillespie algorithm is derived from a probabilistic 
model of chemical reactions at the level of molecular interactions. Given a system of 
chemical reactions indexed by /i, of the form 

771 £i + • • • + r) r E r ^ Ci-Pi + ' ' ' + CpP P 
we associate with each reaction a constant with the property that 

c^St = average probability that a particular combination of 

reactant molecules will react accordingly in the next 
infinitesimal time interval St. 
The constant takes the interpretation in the SSA as the rate constant of the chemical 
reactiorQ- Denote by the stoichiometric coefficient representing the total number of 
possible combinations of R^ reactant molecules available. The propensity := h^c^ 
describes the rate for reaction R^ to occur, in the sense that a^St is the probability that 
reaction R^ will occur in the next infinitesimal time interval 5t. It is then shown that the 
waiting time r for the next reaction in the system to occur is exponentially distributed 
with rate a M and density 

(1) fr(r)= (E a ^e~ E ^ T - 

Additionally, a^/ '(X^ a A*) ls * ne probability that the reaction that occurred is R^. 

The SSA produces a sample trajectory starting from an initialized the reactant state 
space at time t = 0, followed by iterative simulation of subsequent reaction steps. Each 
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reaction step is simulated by generating a random number r from the density dU) and a 
random index f/, for the reaction then updating the reactant state space and time step 
accordingly. 

2.1. A framework for random rate constants in Gillespie's algorithm. Having 
been motivated to accommodate random rate constants into the SSA, we present a frame- 
work that allows us to extend the SSA to situations where the rate constant is a random 
variable. For ease of presentation, we first consider a single reaction scheme involving 
only one reactant. Subsequently, extending the derivation to reactions with more than 
one reactant, or to a system of reactions is straightforward. Thus, consider 



The rate constant c = c(t) is a continuous-time Markov process, which represents the time 
evolution of the fluctuating rate constant. Further properties of the process c(t) can be 
any physically relevant assumptions. In this paper, we are primarily interested to have c(t) 
possess a stationary distribution with density given by w(c). The stationarity of c(t) is a 
natural assumption in view of analogous steady state assumptions on the conformational 
state of a biomolecule [21] . The choice for the stationary density is a modelling issue that 
depends upon the reaction in question, or may be suggested from experimental data. For 
example, the gamma distribution was proposed in [I], 



with parameters a, b > and where T is the gamma function. 

Although the quantity c is a randomly fluctuating process and is no longer constant, we 
will continue to use the term 'rate constant' to refer to values that c(t) takes at a given 
time t. We will also refer to the distribution of c(t) as the 'rate distribution'. 

The starting point for the modification of the SSA is the idea that, given the rate 
constant at the time of the last reaction to be cq, we want to derive a model for the joint 
density f(r, c\\cq) of the waiting time r for the next reaction to occur and the rate constant 
ci at the time when the next reaction occurs. Upon elucidating a model for /(r, ci|co), 
the algorithm proceeds similarly to the original SSA in a reaction-stepwise fashion, except 
that now a pair of random numbers (r, ci) is drawn according to /(r, ci|co), and both the 
time step and rate constant must be updated. In this way, the algorithm sees only the 
rate constants Cj at reaction times, and ignores any underlying properties of the underlying 



Determining the joint density /(r, ci|co) is an issue of modelling, and should based 
on properties of the reactants and chemical system. In general settings, using a joint 
distribution is a central aspect of the modified algorithm because it captures a wide range 
of cases where r and c\ may or may not be correlated. One case is a scenario where the 

2 By writing /(r,ci|co), we are implicitly assuming the Markovian property for both the reaction system 
and the process c(t) . The former case can be justified if we assume a well-mixed reaction system, in the 
sense that each reacting particle's location is uniformly distributed within the reaction volume. The latter 
case is a modelling assumption that is made in this paper for simplicity and illustrative purposes. Under 
other compelling physical motivation, one may be compelled to develop a non-Markovian model for the 

CiS. 



(2) 




c 



(3) 
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process c(t) is modelled to evolve as an independent variable on which the distribution of 
t depends as a dependent variable; the other end of the spectrum are situations where 
(r, ci) are independent, given cq. However, a more delicate and interesting structure of 
interdependence between r and c\ could arise in scenarios where, on the one hand the 
fluctuations of c\ depend on the length of waiting time allotted, while on the other hand 
the waiting time depends on the dynamics of the rate constant. 

To make the problem of modelling the joint density more tractable, it is convenient to 
factorize the joint density /(r, ci|co) into conditional densities, 

(4) f T ,ci|co fc\\co ' fr\ci,co fr\ca ' fci\r,CQ- 

Here, we use / as a generic notation for a density. Either the first or second equality 
may be used for constructing the model. Suppose we are given a model for f T \ Co and 
fci\r,c - The algorithm then proceeds reaction-stepwise as shown in the following table. 
The algorithm works analogously if we know / cl i Co and f T \ Cl , Co instead. 



Suppose Co is the rate constant at the most recent reaction. To find the waiting 
time and rate constant (r, c) of the next reaction, 

(1) Pick r randomly according to f T \ CQ . 

(2) Given r and Co, pick c\ randomly according to / Cl | TjC0 . 

Update state space and rate constant c\. Progress the reaction time by r. 



2.2. Formulating the modified Gillespie's algorithm. In order to use the modifica- 
tion of the Gillespie's algorithm, one has to first construct a model based on one of the 
two conditional density factorizations in (J3|). In this paper, we will focus on the latter 
factorization. Thus, we consider the conditional density / Cl | TiCO (ci|r,co), a.k.a. the tran- 
sition kernel of c(t), to be a model for how c\ fluctuates over the waiting time interval 
of length r, given that it starts at cq. In general, the transition kernel / Cl | TiCo (ci|r, Co) 
should represent any physically relevant description of the c-dynamics. Such a description 
should be constructed to incorporate information about the stationary distribution, as 
well as the physical constraints on the dynamics of c, that for instance could arise systems 
where reactants exhibit slow interconversion between conformational states (e.g. pQ), or 
in systems with rapid fluctuation of rate constants (e.g. [2]). In Section [3j we will show 
a case study of slowly interconverting conformers in which we construct the transition 
kernel with help from the Metropolis-Hastings algorithm. However, for this section, we 
will assume that f Cl \ T ,c ( c i l r > c o) is given to us. 

Suppose /ci|t,c ( c i l r > c o) has been determined. We use the second equality in equation 
(HD to obtain 

/r,ci|c (' r i c l| c o) = f Cl \T,c a {ci\T,C ) ■ / T | co (r|co) 

(5) = / Cl | T , co (ci|r,co) • hp^r) • e~ h K «"*>W 

where 9? Co (t) = E ci (ci|r, Co) is the conditional expected value of c\ given (r, Co), and h is 
the stoichiometric number associated with the number of reactant molecules. In particular, 
we have derived in Appendix lAl that 

(6) /rloO-Ico) = h^ C0 {T)e- h K^'^'. 



6 



CHIA YING LEE 



This formula indicates that the effective propensity of the reaction is a-t(r) = hip CQ {r), in 
the sense that Jl^ +Sr a /J (r')dr / is the probability that the reaction occurs in the infini- 
tesimal time interval [t + r, t + r + St) . 

We observe that the distribution of r depends only on the transition dynamics of the 
process c(t) — specifically, it depends only on the conditional mean of c(t) in the waiting 
time interval. This is an important observation, because it simplifies the modelling de- 
mands to rest only on providing a model of f Cl \ T ,c - Once f cl \ T ,co 1S determined, the model 
for / T i co is automatically available, thereby reducing the complexity of the model. 

Since the cumulative distribution function of f T \ CQ is 

(7) F(t\c ) = l-e-^JiW-')^ 

sampling from the density / t i Co (t|co) can be achieved by inverting the expression 




hip Co (T')dT' = -logr 



for r, where r is a uniform random variable on [0,1]. In other words, defining $ Cf) (r) = 
Jo t Pc a {T')dT l , we find r by the transformation 

r = $- 1 (-ilog(l-r)). 

However, a closed form formula for may not always be readily available, except for 
special forms of the function ip co (r') (see e.g., [TO]), and hence numerical approximation 
procedures will have to be applied to compute r. This may become computationally 
intensive, but we will not discuss these computational issues here. 

2.3. Some equivalent formulations. 

Semi-Markovian approximations. Kou et al. [2] described a so-called semi-Markovian ap- 
proximation in which the fluctuation of the rate constant exhibits large variability from the 
time of one reaction to the next. This is characterized by rapidly fluctuating dynamics of 
the rate constant in a shorter time scale than the reaction waiting times. Thus, the approx- 
imation aspect of this model assumes the process c(t) to be characterized by infinitesimally 
small correlation lengths, so that the rate constant at the next reaction is independent of 
its value at the last reaction. This approximation leads to setting f cl \ TC0 ( c i\ T J c o) = w ( c i)- 
It is no surprise that the semi-Markovian approximation yields no correlation between 
successive reaction waiting times [2]. 

Time independent transition kernels. If f Cl \ T ,c does not depend on r, then we have / Cl | rjCo (ci 
/ci|c„( c il c o) and / r | cl|C0 (r|ci,c ) = / T | co (r|c ) = h(p CQ e~ ThiPc o . Then it is equivalent to use 
either the first or second equalities in Q. However, in general cases, it is not trivial to 
derive a formula for / r | Cl ,co( T l c i> °o)- 

Reduction to the original SSA.. Equation (jl]) reduces to Gillespie's original SSA with a 
non-random rate constant c by the special choice of w(c) = 5{c — c), where 5 represents 
the Dirac mass at 0. This leads to setting f Cl \r,c { c i\ T i c) = S(ci — c) in equation ([5]). It is 
clear that the algorithm will always pick c\ = c = cq almost surely. 
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Non-random but time varying rate constants. If the rate constant is a non-random function 
of time, c(t), we can still apply the framework without the stationarity assumption. The 
conditioning will be on Co at the current time t, and thus we set / Cl i T) t ;Co (ci |r, t, c(t)) = 
6(cx - c(t + r)). This case had been studied in [101 [TT1 122] . 

3. Single enzyme reactions - an example 

We illustrate an application of the modified Gillespie algorithm to the simulation of 
single enzyme reactions, en route proposing a method to construct a model for the tran- 
sition kernel / cl | rjCo (ci|r, Co) using the Metropolis-Hastings algorithm. Here, we consider 
a simple version of the enzyme reaction, where an enzyme E binds reversibly with the 
substrate S to form a complex ES, which then dissociates to release the product P. The 
reaction is given by the typified enzyme kinetics scheme 

fei 

(8) E + S ^Zt ES E + P 

fe-i 

where k± and k-\ are non-random rate constants associated with the original Gillespie 
algorithm, for the complex formation and dissociation reactions, and c is the random rate 
constant for the product formation reaction with density w(c). 

When w(c) is a non-random constant c, classical Michaelis-Menten kinetics provides a 
deterministic relation between the rate of product formation and the substrate concentra- 
tion, 

(9) d[P] _ *w[S] 



dt [5] + K M 

where v max = ([E] + [ES])c is the maximum enzyme velocity and Km = fc ~ fc 1+c is the 
Michaelis constant. Here and in future, the notation [E], [S], etc., denotes the concentra- 
tion of the reactant species E, S, etc. In the case of a single enzyme system, it is more 
appropriate when investigating the rate of formation of the product to consider the waiting 
time between the formation of successive products P, which we refer to as the turnover 
time r. The rate of product formation can be reformulated as 

(io) -i- = ~ clSi 

{ ' (t) [SI+Km 

where (r) is the ensemble average of r (see [TTJ O [2] ) . 

When dynamic disorder is present, the rate constant c fluctuates according to a distri- 
bution w(c). In this case, the rate of product formation in single enzyme reactions under 
a quasi-static condition of dynamic disorder has been shown to take an analogous form [2] 

(id — = m 

{ ' (r) [S] + C M 
where 

1 , r (fc-i + X) 

X = n — ) anci Cm = ; 

foo«^) dc ' kl 

are the harmonic mean of c and the effective Michaelis constant, respectively. 
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In the next section, we derive an algorithm to simulate the single enzyme reaction system 
using the method of modifying the SSA as described in the previous section. In addition 
to making a simple extension to multiple reactions, we propose a method to model the 
joint distribution f(r, c\) that can be applied quite generally. A major consideration in 
the modelling comes from the paper by English et al pQ, in which it was proposed that 
the single enzyme reaction exhibits dynamic disorder with the rate constant c distributed 
according to the gamme distribution ([3]). More importantly, the dynamic disorder was 
suggested to be a result of conformational fluctuations of the enzyme that occur at time 
scales much larger than product formation time lengths. 

3.1. Modelling slowly interconverting conformers. We focus our attention for the 
time being on the product formation reaction step, ES — I P + E. Equation (llip holds 



with the assumption that interconversion of conformers occur much slower than the reac- 
tion. This assumption is incorporated into the modified SSA by restricting the rate con- 
stant at the next time step, c±, to change by only a small amount between each reaction. 
Specifically, we use the conditional density as described in §2.21 and model / Cl | r ,co( c il T ' c o) 
to be a transition kernel whose support lies in an e-small interval around cq. As a first 
pass, we also make a simplifying assumption that / cl i r co (ci|r, Co) does not depend much 
on r, and consider in its place a conditional density / Cl | Co ( c il c o) whose support lies in a 
fixed e-small interval around cq. Alternatively, removing the simplifying assumption can 
be done by modelling the e-small interval to grow with r. 

Upon deciding on the support interval for f Cl \T,c ( c l\ T ) c o)> we then make use of the 
Metropolis-Hastings algorithm to construct a Markov chain for the successive values of c\ , 
as follows. Define the density 



(12) ff(ci|co) = j 

and define the acceptance probability 



if ci G (c - §,c + § ] 
otherwise 



w(ci) g(ci\c )\ . ( w(ci) 
ol\c\\cq) = mm I 1, — - — -— — j — - = mm 1 



w(c ) g(c \ci)J \ ' w(co) 

for |ci — co | < e/2. Then the transition probability for c\ does not depend on r and is 
given by 



(13) =5(ci|c )min 1 



/ Cl |c ( c il c o) = g(ci\c )a(ci\c ) 

w(ci) ' 
w(co). 

It is clear that the c\ generated in this way will lie in the interval (co — e/2, cq + e/2), and 
it is easy to check that / C i| co ( c il c o) is the transition kernel of a Markov chain with w(c) as 
its unique stationary distribution. That is, if we pick a random number r c from the forc- 
ing distribution g(ci\co), and accept it with probability a(r c \co) = min (l w ^ Ta > 9{rc\co) 



w(c Q ) g{c \r c ) 

we get that r c satisfies the transition kernel / Cl |i-,c ( c i l r > c o)i an d moreover r c is w(c) 
distribution provided cq is u»(c)-distributed. 
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Suppose Cq are the rate constants at the most recent reaction. 

(1) Pick a random number r and set r to be the time satisfying 

(14) rj2h^ dr' = -logr 

Jo v 

(2) For each i>, pick c\ from the density /^t | r c "( c i l T ) c o)- 

If f v (c v \r,(%) is of the form (USD, then' pick c u from g u (c u \T,c%), and 
accepts c u with probability a u {c v \r, Cq). 

(3) Determine the reaction to occur by picking reaction R u with probability 

Update the state space. 

Progress the reaction time by r and update the rate constant to Cq <— c\. 



Figure 1. Modified Gillespie Algorithm 



From ([5]), we automatically have a formula for the waiting time distribution 

/ T | C0 (r|c ) = h ES <p co e- h ^ So 

where h,ES, the stoichiometric number of the complex ES, is 1 if the enzyme is in complex 
form ES and if it is in free form E, and 



¥c = / ci/ cl i TC0 (ci \T,co)dci = / —mm 1, — — dc\. 

JO ' Jc -e/2 e V W{C )J 



he/2 c i w ( c iY 

'co-e/2 

Thus, the effective propensity of the product formation step is fup Co . 

Before moving on, it should be remarked that the use of the Metropolis-Hastings algo- 
rithm is solely as a way to construct a Markov chain possessing a stationary distribution. 
Although its original development was to simulate samples from a stationary distribution 
that is otherwise difficult to simulate from, this purpose plays no role in its application 
here. Quite the contrary, the stationary distribution may be an easily simulated distri- 
bution, as is the case in our example with the gamma distribution. Rather, any Markov 
chain can be used as a model, so long as it satisfies the physical constraints — a maximum 
range of fluctuations and a stationary distribution. The use of the Metropolis-Hasting 
algorithm so happens to be a convenient choice because it provides an easy way to specify 
the maximum range of fluctuations via the density g whilst maintaining the stationarity 
due to the acceptance probability a. And we will see in the simulation results that this 
modelling choice corroborates the experimental results. 

3.2. The modified SSA for single enzyme reactions. Finally, the Gillespie algorithm 
is adapted to integrate the procedure from the previous subsection into the algorithm for 
the entire enzyme reaction system which involves more than one reaction. In this reaction 
system, let #S be the number of substrate molecules, and He = 1 — h,ES- The propensities 
of the reversible enzyme-substrate complex formation reaction are hE{$S)k\, h-Esk-i 
and, given the current rate constant Co, the propensity of the product-forming reaction is 
h-ES^Pcd- The update quantities to be determined for each reaction step are the waiting 
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time r, the new rate constant c\ for the product formation reaction, as well as the reaction 
that occurs. 

The waiting time r to the next reaction is chosen to satisfy 



for a uniform random number r. Note that here, due to our assumption that g(c\ \cq) is 
time- independent, ¥>c o ( T is constant is r' and thus the evaluation of r is a direct algebraic 
evaluation. Next, the new rate constant c\ is chosen by picking c from the density g(c\co), 
and accepting c with probability a(c\co). If it is accepted, set c\ = c, otherwise it is 
rejected and set c\ = cq. Finally, the reaction that occurs is chosen in a standard way of 
comparing the relative propensities of the reactions at the time of the next reaction 
for the three reactions, complex formation, complex dissociation, and product formation 
reactions, the probabilities of their occurrence are proportional to 



respectively. This leads to the modified SSA for simulating slowly interconverting con- 
formers in single enzyme reactions. 

For an arbitrary chemical system of reactions R v with random rate constants drawn 
from the distribution f" v \ TC T and effective propensities h u ip^(r), the generalization of the 
modified Gillespie algorithm is obvious. Fig. [T]summarizes the algorithm for each iteration 
of the reaction step. 

3.3. Simulation results. To simulate a single enzyme system, we run the simulation 
starting with one enzyme molecule and #S = 60, 120, 300, 600 number of free substrate 
molecules. We assume a buffered substrate solution, that is, #S remains constant even 
if a complex formation reaction occurs. The rate constants were k\ = 50, = 18300, 
and c follows a gamma distribution with parameters a = 4.2, b = 220 in equation (|3"|) . 
The fluctuation of c is at most e = 50s -1 in (|12|) . which we fixed for all values of #S. 
With these parameters, each product formation occurs on average once in 30 complex 
dissociation reactions. 

While the simplifying assumption is limited, the simulation results nonetheless show 
several features similar to those obtained experimentally by English et al., [I]. One key 
feature is the heavy-tailed property of the distribution of the turnover time r, a phe- 
nomenon that is not observed if the rate constant is non-random. In the latter case, the 
distribution of the turnover time is known to exhibit exponential decay [2j. Figure [2] shows 
the heavy-tailed property in a histogram of turnover times when the modified algorithm 
was used (dots), compared against the exponential decay obtained from the model without 
dynamic disorder (crosses) with a non-random rate constant c = c = E ro [c]. For each value 
of #5, the heavy-tailed property is most apparent the rare regime of large turnover times. 

The crux of the work by English et al. is the discovery of correlations between successive 
turnover times of a single enzyme molecule, where they found that short turnover times 
are more likely to be followed by short turnover times, and vice versa. The top panel of 
Figure [3] shows the autocorrelation function of the turnover time series {r m }, computed 



(15) 




(16) 
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Histogram of the turnover waiting time 
(with fixed max fluctuation e = 50) 
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Figure 2. Histogram of turnover waiting times, normalized so that the 
relative frequency of the smallest time bin equals 1. The heavy-tailed be- 
haviour (large blue markers) due to dynamic disorder is juxtaposed against 
the exponentially light-tailed behavior in the absence of dynamic disorder 
(small pink markers). 



as 

and converting C T (m) to C T (t) using t = m(r). By construct, our model ensures that such 
positive correlations are produced, but that the systems containing different numbers of 
substrate molecules exhibit the same autocorrelation behavior shifted horizontally This 
is attributable to the fact that the variance of the forcing distribution g(c\co) governs the 
correlation between successive c, so that the same value of e applied for each value of #£* 
necessarily results in the same autocorrelation behavior. This result differs from English's 
experimental result, which show an increasing degree of autocorrelation for increasing 
substrate concentrations. Heuristically, the experimental results can be rationalized by 
noting that, when #S is low, the complex-forming reaction becomes rate limiting, wider 
fluctuations of c are attainable during the longer waiting time for the reaction, resulting 
in smaller correlation between turnover times. This heuristic indicates that the fixed e 
model is inadequate. 
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Fixed max fluctuation e = 50 
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Figure 3. Autocorrelation graph. The top panel shows the autocorrela- 
tion C T for the fixed e model in Section 13.14 the middle panel shows the 
autocorrelation for a model without dynamic disorder; the bottom panel 
shows the autocorrelation for the model with time-dependent c-dynamics 
in Section I3.41 
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3.4. Modelling time dependence in the c-dynamics. It is fair to say that the sim- 
plified model in Section 13.11 with fixed e is a poor model for the true dynamics of c. To 
improve the model, it becomes necessary to incorporate the time dependence in the c- 
dynamics. As a next step toward this goal, we consider a model in which the amount of 
fluctuation of the rate constant c depends on the length of the reaction waiting time. In 
place of (|12p . we use an interval that increases linearly with time, 

(17) !«-*.« + *) 

I U, otherwise 

As usual, the acceptance probability is given by a(c\\r, cq) = min (1, w ( Cl )g( Cl j T ' c o) j f or 



io(co)g(co|r,ci) j 

ci € (co — , Co + t^), and the transition probability is / Cl i TjC0 = <?(ci |r, co)a(a \t, co). The 
waiting time distribution, given in expression (J5j), is 

/.| co (r|c ) = ^ Co (r)e-' l i'o T ^ (r')^ 
where the effective propensity (p Co ( T ) now does depend on r, 

¥>c r = / ci/ C i|r, C0 « c i = / —mm 1, — — da. 

The algorithm in Diagram Q] is then applied to the model. 

One immediate difficulty with incorporating time-dependence into the Gillespie algo- 
rithm is that sampling the waiting time distribution involves inverting the equation (|14j) . 
In general, ip" (r) may be a complicated function and an explicit expression for the waiting 
time r is difficult to obtain. Consequently it becomes necessary to solve for r numerically, 
and this procedure becomes computationally intensive. In our simulations, assuming that 
r is sufficiently small, we solve for r by linearizing the LHS of (|14|) around 0. In this way, 
we speed up the algorithm albeit at the expense of admitting some error. 

Figure d] shows the histogram of the turnover times for increasing numbers of substrate 
molecules, #S = 60, 120,300,600. Here, we took e = 5 x 10 5 s" 2 . Similar to Figure the 
heavy tail behaviour is also observed. The key difference of the time-dependent model is 
seen in the autocorrelation graph in Figure El where the autocorrelation increases with 
increasing #S. Compared to the fixed e model, the autocorrelation behaviour is in closer 
agreement qualitatively with the experimental results of English et al. 

As suggested by equations (fTUj) and (fTT|) . the Lineweaver-Burke plots (Fig. [5]) obtained 
for the models with and without dynamic disorder reveal the linear relationship between 
the mean turnover time (r) and the inverse substrate concentrations. The slope and y- 
intercept for the linear best fits for this data are shown in the following table. Also shown 
are the estimated values of \ an d Cm for the models with dynamic disorder, and true 
and estimated values of c and Km for the model without dynamic disorder. The time- 
dependent c-dynamics model gives a more accurate estimate of Cm, though the estimate 
of x is still not accurate. 

3.5. Computational efficiency. The modified algorithm requires four random number 
draws per reaction step, as compared to two for the original Gillespie algorithm. Given the 
added complexity of the model, this is not an excessively large computational burden. Also, 
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Histogram of the turnover waiting time 
(with max fluctuation ex, e=5x10 5 ) 
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Figure 4. Histogram of turnover waiting times, normalized so that the 
relative frequency of the smallest time bin equals 1. The time-dependent 
c-dynamics model (large blue markers) and the model without dynamic 
disorder (small pink markers) are shown. 





Fixed e model 


Time-dependent 
c-dynamics model 


W/o dynamic 
disorder 


Slope 


0.4389 


0.4219 


0.41861 


y-intercept 


0.0009637 


0.001095 


0.001088 


Estimated x or c 


1037.7 


913.5 


918.8 


Estimates Cm or Km 


455.4 


385.4 


384.9 


True x or c 


704 


704 


924 


True Cm or Km 


380.1 


380.1 


384.5 



the modified algorithm suffers from the same issue of computational efficiency that the 
original Gillespie algorithm faces, due to have to evolve the system reaction by reaction. 
Techniques to speed up the computing time, such as the next reaction method, can be 
applied to improve the computational efficiency in this modification of the algorithm, and 
will be the subject of future work. However, the largest computational cost arises from 
having to invert equation (|14|) to find r. Approximations of r, like the linearization done 
in Section (|3.4p . may be employed if the error can be properly quantified. 
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Figure 5. The Lineweaver-Burke plot shows the linear relationship be- 
tween the inverse of the substrate amount and the mean turnover waiting 
time, for the cases with and without dynamic disorder. 



4. Discussion and Conclusion 

We have provided a framework for the modified Gillespie algorithm to address the 
problem of stochastic simulation of biochemical systems possessing dynamic disorder. Al- 
though the modelling and implementation shown in the single enzyme reaction examples 
leave much room for refinement, it nonetheless is able to pinpoint some concrete modelling 
ingredients that are corroborated by experimental data, and is a versatile method that 
can be adapted to many model dynamics. This gives us a good indication of the direction 
that further modelling efforts can take. With this framework, model calibration against 
real data will be possible. 
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Appendix A. Derivation of f(r\co) 

In this appendix, we derive the formula for the conditional probability /(r|co), assuming 
that we have available a model for the transition kernel f Cl \r,c ( c i l r > c o) f° r the process c(t). 

Suppose that at time t the rate constant is cq. Let Pq{t\cq) be the conditional probability 
that no reaction occurs within the next r time. Then the conditional probability that the 
next reaction occurs in the infinitesimal time interval [t + r, t + r + St) is approximately 
fr\c {T\c )5T, and 

fr\c (-r\co)ST 

= Po( r l c o) x Pr(reaction occurs in [t + r, t + r + 5t)\cq) 

For an arbitrary r, recalling the transition kernel / cl | T)Co (ci |r, cq) for the underlying process 
c(t), we condition on the value of c T = c(t + r) at time t + r, 

Pr (reaction occurs in [t + r, i + r + <5r)|co) 

= / Pr(reaction occurs in [t + r, t + r + St)\c t , cq) • fc T \T,c (c T \t, co)dc T 
Jr+ 

= / hc T 5Tf CTlrCo (c T \T,c )dc T 

= hip CQ (T)5T 

where (fc ( T ) = E Cl (ci|r, co). In the event that the reaction occurs in [t + r, i + r + <5r), 
we will have that ci = c r . 
To find Pq(t\co), 

P (t + 5r|c ) 

= Po( r |co) x Pr(No reaction occurs in [t + r, t + r + <5r)|co) 
= PoO|c ) x (1 - hip C0 {T)5T) 
So Po( r l c o) satisfies a differential equation 

dlogPo , , v 

with the initial condition P (t = 0|c ) = 1. Hence, Pq(t\cq) = e~ h h ^ c o( T ') dr ', and 

/ r | Co (r|co) = V co (r)e-^o r ^o(^r'_ 
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